AITopics

2603.00192

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Virginia > Albemarle County > Charlottesville (0.04)
Asia > Middle East > Saudi Arabia (0.04)
Asia > India > Maharashtra > Mumbai (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.56)

arXiv.org Artificial IntelligenceDec-10-2025

Semantic-Metric Bayesian Risk Fields: Learning Robot Safety from Human Videos with a VLM Prior

Chen, Timothy, Dominguez-Kuhne, Marcus, Swann, Aiden, Liu, Xu, Schwager, Mac

Humans interpret safety not as a binary signal but as a continuous, context- and spatially-dependent notion of risk. While risk is subjective, humans form rational mental models that guide action selection in dynamic environments. This work proposes a framework for extracting implicit human risk models by introducing a novel, semantically-conditioned and spatially-varying parametrization of risk, supervised directly from safe human demonstration videos and VLM common sense. Notably, we define risk through a Bayesian formulation. The prior is furnished by a pretrained vision-language model. In order to encourage the risk estimate to be more human aligned, a likelihood function modulates the prior to produce a relative metric of risk. Specifically, the likelihood is a learned ViT that maps pretrained features, to pixel-aligned risk values. Our pipeline ingests RGB images and a query object string, producing pixel-dense risk images. These images that can then be used as value-predictors in robot planning tasks or be projected into 3D for use in conventional trajectory optimization to produce human-like motion. This learned mapping enables generalization to novel objects and contexts, and has the potential to scale to much larger training datasets. In particular, the Bayesian framework that is introduced enables fast adaptation of our model to additional observations or common sense rules. We demonstrate that our proposed framework produces contextual risk that aligns with human preferences. Additionally, we illustrate several downstream applications of the model; as a value learner for visuomotor planners or in conjunction with a classical trajectory optimization algorithm. Our results suggest that our framework is a significant step toward enabling autonomous systems to internalize human-like risk. Code and results can be found at https://riskbayesian.github.io/bayesian_risk/.

large language model, machine learning, trajectory, (20 more...)

2512.08233

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.86)

Industry:

Materials > Containers & Packaging (0.67)
Consumer Products & Services > Personal Products (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)
(2 more...)

Spadon, Gabriel, Vaidheeswaran, Vaishnav, DiBacco, Claudio

A Theoretical Framework for Environmental Similarity and Vessel Mobility as Coupled Predictors of Marine Invasive Species Pathways

arXiv.org Artificial IntelligenceNov-7-2025

Marine invasive species spread through global shipping and generate substantial ecological and economic impacts. Traditional risk assessments require detailed records of ballast water and traffic patterns, which are often incomplete, limiting global coverage. This work advances a theoretical framework that quantifies invasion risk by combining environmental similarity across ports with observed and forecasted maritime mobility. Climate-based feature representations characterize each port's marine conditions, while mobility networks derived from Automatic Identification System data capture vessel flows and potential transfer pathways. Clustering and metric learning reveal climate analogues and enable the estimation of species survival likelihood along shipping routes. A temporal link prediction model captures how traffic patterns may change under shifting environmental conditions. The resulting fusion of environmental similarity and predicted mobility provides exposure estimates at the port and voyage levels, supporting targeted monitoring, routing adjustments, and management interventions.

artificial intelligence, machine learning, similarity, (11 more...)

2511.03499

Country: North America > Canada > Nova Scotia > Halifax Regional Municipality (0.14)

Genre: Research Report (0.50)

Industry: Transportation > Marine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.90)

Umapathy, Lavanya, Johnson, Patricia M, Dutt, Tarun, Tong, Angela, Nayan, Madhur, Chandarana, Hersh, Sodickson, Daniel K

Context-aware deep learning using individualized prior information reduces false positives in disease risk prediction and longitudinal health assessment

arXiv.org Artificial IntelligenceOct-20-2025

Temporal context in medicine is valuable in assessing key changes in patient health over time. We developed a machine learning framework to integrate diverse context from prior visits to improve health monitoring, especially when prior visits are limited and their frequency is variable. Our model first estimates initial risk of disease using medical data from the most recent patient visit, then refines this assessment using information digested from previously collected imaging and/or clinical biomarkers. We applied our framework to prostate cancer (PCa) risk prediction using data from a large population (28,342 patients, 39,013 magnetic resonance imaging scans, 68,931 blood tests) collected over nearly a decade. For predictions of the risk of clinically significant PCa at the time of the visit, integrating prior context directly converted false positives to true negatives, increasing overall specificity while preserving high sensitivity. False positive rates were reduced progressively from 51% to 33% when integrating information from up to three prior imaging examinations, as compared to using data from a single visit, and were further reduced to 24% when also including additional context from prior clinical data. For predicting the risk of PCa within five years of the visit, incorporating prior context reduced false positive rates still further (64% to 9%). Our findings show that information collected over time provides relevant context to enhance the specificity of medical risk prediction. For a wide range of progressive conditions, sufficient reduction of false positive rates using context could offer a pathway to expand longitudinal health monitoring programs to large populations with comparatively low baseline risk of disease, leading to earlier detection and improved health outcomes.

artificial intelligence, index visit, machine learning, (13 more...)

2510.15591

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Urology (0.92)
Health & Medicine > Therapeutic Area > Oncology > Prostate Cancer (0.52)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Jeanselme, Vincent, Tom, Brian, Barrett, Jessica

Competing Risks: Impact on Risk Estimation and Algorithmic Fairness

arXiv.org Artificial IntelligenceAug-8-2025

Accurate time-to-event prediction is integral to decision-making, informing medical guidelines, hiring decisions, and resource allocation. Survival analysis, the quantitative framework used to model time-to-event data, accounts for patients who do not experience the event of interest during the study period, known as censored patients. However, many patients experience events that prevent the observation of the outcome of interest. These competing risks are often treated as censoring, a practice frequently overlooked due to a limited understanding of its consequences. Our work theoretically demonstrates why treating competing risks as censoring introduces substantial bias in survival estimates, leading to systematic overestimation of risk and, critically, amplifying disparities. First, we formalize the problem of misclassifying competing risks as censoring and quantify the resulting error in survival estimates. Specifically, we develop a framework to estimate this error and demonstrate the associated implications for predictive performance and algorithmic fairness. Furthermore, we examine how differing risk profiles across demographic groups lead to group-specific errors, potentially exacerbating existing disparities. Our findings, supported by an empirical analysis of cardiovascular management, demonstrate that ignoring competing risks disproportionately impacts the individuals most at risk of these events, potentially accentuating inequity. By quantifying the error and highlighting the fairness implications of the common practice of considering competing risks as censoring, our work provides a critical insight into the development of survival models: practitioners must account for competing risks to improve accuracy, reduce disparities in risk assessment, and better inform downstream decisions.

data mining, machine learning, modeling, (19 more...)

2508.05435

Country:

North America > United States (0.46)
Europe > Netherlands (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Zhou, Wenbin, Orfanoudaki, Agni, Zhu, Shixiang

Conformalized Decision Risk Assessment

arXiv.org Machine LearningMay-20-2025

High-stakes decisions in domains such as healthcare, energy, and public policy are often made by human experts using domain knowledge and heuristics, yet are increasingly supported by predictive and optimization-based tools. A dominant approach in operations research is the predict-then-optimize paradigm, where a predictive model estimates uncertain inputs, and an optimization model recommends a decision. However, this approach often lacks interpretability and can fail under distributional uncertainty -- particularly when the outcome distribution is multi-modal or complex -- leading to brittle or misleading decisions. In this paper, we introduce CREDO, a novel framework that quantifies, for any candidate decision, a distribution-free upper bound on the probability that the decision is suboptimal. By combining inverse optimization geometry with conformal prediction and generative modeling, CREDO produces risk certificates that are both statistically rigorous and practically interpretable. This framework enables human decision-makers to audit and validate their own decisions under uncertainty, bridging the gap between algorithmic tools and real-world judgment.

artificial intelligence, data mining, machine learning, (18 more...)

2505.13243

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Indiana > Marion County > Indianapolis (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (1.00)
Energy > Power Industry (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.88)

arXiv.org Machine LearningApr-16-2025

Predictive Multiplicity in Survival Models: A Method for Quantifying Model Uncertainty in Predictive Maintenance Applications

Cavus, Mustafa

In many applications, especially those involving prediction, models may yield near-optimal performance yet significantly disagree on individual-level outcomes. This phenomenon, known as predictive multiplicity, has been formally defined in binary, probabilistic, and multi-target classification, and undermines the reliability of predictive systems. However, its implications remain unexplored in the context of survival analysis, which involves estimating the time until a failure or similar event while properly handling censored data. We frame predictive multiplicity as a critical concern in survival-based models and introduce formal measures -- ambiguity, discrepancy, and obscurity -- to quantify it. This is particularly relevant for downstream tasks such as maintenance scheduling, where precise individual risk estimates are essential. Understanding and reporting predictive multiplicity helps build trust in models deployed in high-stakes environments. We apply our methodology to benchmark datasets from predictive maintenance, extending the notion of multiplicity to survival models. Our findings show that ambiguity steadily increases, reaching up to 40-45% of observations; discrepancy is lower but exhibits a similar trend; and obscurity remains mild and concentrated in a few models. These results demonstrate that multiple accurate survival models may yield conflicting estimations of failure risk and degradation progression for the same equipment. This highlights the need to explicitly measure and communicate predictive multiplicity to ensure reliable decision-making in process health management.

data mining, machine learning, prediction, (20 more...)

2504.12156

Country:

Asia > Middle East > Republic of Türkiye > Eskisehir Province > Eskisehir (0.04)
Europe > Switzerland (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Civil Rights & Constitutional Law (0.35)
Health & Medicine > Consumer Health (0.35)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Ildiz, M. Emrullah, Gozeten, Halil Alperen, Taga, Ege Onur, Mondelli, Marco, Oymak, Samet

High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws

arXiv.org Machine LearningOct-24-2024

A growing number of machine learning scenarios rely on knowledge distillation where one uses the output of a surrogate model as labels to supervise the training of a target model. In this work, we provide a sharp characterization of this process for ridgeless, high-dimensional regression, under two settings: (i) model shift, where the surrogate model is arbitrary, and (ii) distribution shift, where the surrogate model is the solution of empirical risk minimization with out-of-distribution data. In both cases, we characterize the precise risk of the target model through non-asymptotic bounds in terms of sample size and data distribution under mild conditions. As a consequence, we identify the form of the optimal surrogate model, which reveals the benefits and limitations of discarding weak features in a data-dependent fashion. In the context of weak-to-strong (W2S) generalization, this has the interpretation that (i) W2S training, with the surrogate as the weak model, can provably outperform training with strong labels under the same data budget, but (ii) it is unable to improve the data scaling law. We validate our results on numerical experiments both on ridgeless regression and on neural network architectures.

artificial intelligence, machine learning, surrogate-to-target model, (19 more...)

2410.18837

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
Europe > Austria (0.04)

Genre: Research Report (0.85)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Nobel, Parth T., LeJeune, Daniel, Candès, Emmanuel J.

RandALO: Out-of-sample risk estimation in no time flat

arXiv.org Machine LearningSep-15-2024

Training machine learning models is an often expensive process, especially in large data settings. Not only is there significant cost in the fitting of individual models, but even more importantly, the best model must be chosen from a set of candidates parameterized by a set of "hyperparameters" indexing the models, and each of these models must be fitted and evaluated in order to make the optimal selection. As a result, model selection, also called hyperparameter tuning, tends to be the most computationally expensive part of the machine learning pipeline. In order to evaluate models, we typically need to set aside unseen "holdout" data to estimate the risk of the model on new samples from the training distribution. When we have an abundance of training samples, such as in the millions or billions, we can afford to set aside a modest holdout set of tens of thousands of examples without compromising model performance.

jacobian-vector product, randalo, risk estimate, (15 more...)

2409.09781

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

arXiv.org Artificial IntelligenceJun-13-2024

Assessing Model Generalization in Vicinity

Liu, Yuchi, Sun, Yifan, Wang, Jingdong, Zheng, Liang

This paper evaluates the generalization ability of classification models on out-of-distribution test sets without depending on ground truth labels. Common approaches often calculate an unsupervised metric related to a specific model property, like confidence or invariance, which correlates with out-of-distribution accuracy. However, these metrics are typically computed for each test sample individually, leading to potential issues caused by spurious model responses, such as overly high or low confidence. To tackle this challenge, we propose incorporating responses from neighboring test samples into the correctness assessment of each individual sample. In essence, if a model consistently demonstrates high correctness scores for nearby samples, it increases the likelihood of correctly predicting the target sample, and vice versa. The resulting scores are then averaged across all test samples to provide a holistic indication of model accuracy. Developed under the vicinal risk formulation, this approach, named vicinal risk proxy (VRP), computes accuracy without relying on labels. We show that applying the VRP method to existing generalization indicators, such as average confidence and effective invariance, consistently improves over these baselines both methodologically and experimentally. This yields a stronger correlation with model accuracy, especially on challenging out-of-distribution test sets.

accuracy, risk proxy, vicinal, (15 more...)

2406.09257

Country:

Asia > Middle East > Israel (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)